Brittle System Analysis



Stephen F. Bush, John Hershey and Kirby Vosburgh



Abstract 

The goal of this paper is to define and analyze systems which exhibit brittle behavior. This behavior is characterized by a sudden and steep decline in
performance as the system state changes. This can be due to input parameters which exceed a specified tolerance, or environmental conditions which exceed specified
operating boundaries.
I. Introduction



BASED on vast experience watching the fruit of my hard work fall apart time and again, I feel highly qualified to discuss the
manner in which systems break. In particular, the goal of this paper is to define and analyze systems which exhibit brittle
behavior. This behavior is characterized by a sudden and steep decline in performance as system state changes, as shown by
point D along curve P_h in Figure 1. P_h is the performance curve for a high performance system with brittle characteristics. P_l
is the performance curve for a lower performance system with less brittle characteristics. Clearly the slope from point D along
curve P_h is much steeper than that from point E along curve P_l. The steep decline of performance along P_h can be due to input
parameters which exceed a specified tolerance, or environmental conditions which exceed specified operating boundaries. This is
equivalent to material fracture. Materials science provides a terminology which is apropos and flexible enough to describe the
characteristics of this work. A table of materials science terms and their corresponding brittle system definitions is shown in
Table I. Toughness [?] is the amount of energy absorbed by a material prior to failure. A brittle fracture occurs with very little
energy absorption while a ductile fracture is accompanied by much energy absorption. Clearly toughness is the analog of the
robustness of a system. To carry the analogy further, ductility is quantified as the amount of permanent strain prior to fracture.
A system which does not exhibit brittle behavior will be called ductile [?]. Strain is unitless and refers to the amount of
deformation per unit length of a material; it is caused by stress, which is the force per unit area. Material deformation is
analogous to degradation in a brittle system. In our work, stress is the distance by which a parameter exceeds its specified
operating tolerance. There are two forms of strain, reversible and permanent. Reversible strain is called elastic strain and is
characterized by Young's modulus: the ratio of stress to strain. Permanent strain leaves the shape of a material permanently
changed and is known as plastic strain. In a brittle system, plastic strain is degradation from which the system cannot recover,
while reversible strain is degradation from which the system can recover.
Fig. 1

A Brittle versus Ductile System. 



Increasing both hardness and ductility increases the toughness of a material. Hardness is increased by deforming the crystal
structure, either by adding impurities to a homogeneous material or by rapid cooling of the material after processing. In this
work, increasing hardness of a material is analogous to increasing the gain of the sub-components of a system. Previous work
has focused on the hardness of a system, but relatively little on the ductility. For example, in choosing design parameters for a
system [?], one examines the effect of high and low parameter values within the utility of normal operation of system performance
(U(normalOperation)) and chooses those values which result in the best performance (P_h). However, the behavior and utility
of the system when tolerance is exceeded (U(robust)) have rarely been examined. Certainly if time is considered, then based on
simple reliability theory the utility is shown in Equation 1, where H and D are shown in Figure 2, x is a design parameter, T is
the set of in-tolerance values, P[] is the probability of the event in the brackets, U(normalOperation) is the utility to the user
of the system in normal operation, and U(robust) is the utility to the user of the system outside normal operation. As graceful
degradation becomes a more desirable feature, the utility of area D increases. Let us define brittleness as the ratio of the
hardness over the ductility, which is the area H over D in Figure 2.
Fig. 2

A Definition of System Brittleness. 



$$\mathrm{Utility} = P[x \in T]\, H\, U(\mathrm{normalOperation}) + P[x \notin T]\, D\, U(\mathrm{robust}) \tag{1}$$
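As a small numerical illustration of Equation 1 and of the brittleness ratio H/D, the following sketch evaluates both for hypothetical values. The function names and the sample numbers are assumptions made for illustration only; they are not taken from the paper.

```python
def expected_utility(p_in_tolerance, hardness, ductility, u_normal, u_robust):
    """Equation 1: utility weighted by the probability of being in or out of tolerance.

    p_in_tolerance is P[x in T]; hardness and ductility are the areas H and D.
    """
    if not 0.0 <= p_in_tolerance <= 1.0:
        raise ValueError("p_in_tolerance must be a probability")
    return (p_in_tolerance * hardness * u_normal
            + (1.0 - p_in_tolerance) * ductility * u_robust)


def brittleness(hardness, ductility):
    """Brittleness defined as the ratio of hardness (H) over ductility (D)."""
    if ductility <= 0.0:
        raise ValueError("ductility must be positive")
    return hardness / ductility


if __name__ == "__main__":
    # Hypothetical system: mostly in tolerance, large H, small D.
    print(expected_utility(0.9, 8.0, 2.0, 1.0, 0.5))
    print(brittleness(8.0, 2.0))
```

Raising u_robust in this sketch corresponds to the remark above that the utility of area D increases as graceful degradation becomes more desirable.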
Materials science term    Brittle system definition
Toughness                 system robustness
Hardness                  level of performance within tolerance
Ductility                 level of performance outside tolerance
Plastic Strain            system cannot recover from degradation
Reversible Strain         system can recover from degradation
Brittle Fracture          sudden steep decline in performance
Ductile Fracture          graceful degradation in performance
Brittleness               ratio of hardness over ductility
Deformation               degradation in performance
Young's Modulus           amount tolerance exceeded over degradation

TABLE I
Terminology.



At this point it must be mentioned that some causes of brittle fracture may be more difficult to deal with than others. For example,
the sudden loss of performance can be due to a catastrophe [?], [?]. Catastrophe Theory is essentially the study of singularities; in
this work it would be one of many causes for brittle behavior. The connection between Catastrophe Theory and Brittle Systems is
only one of the many areas that need to be explored in this new research area.
II. Sensitivity

As a first step, design parameters, x_p, which affect ductility must be identified. The sensitivity of ductility to a particular
parameter is characterized by ψ, as shown in Equations 2 and 3. In Equation 2, γ is the shaded area A - B in Figure 3, which is
a function of two values, x_1 and x_2, of a single design parameter, x_p. In Equation 3, the sensitivity of ductility is defined as the
rate of change of the difference A - B. Figure 3 shows two curves for the same system, one curve which is brittle, the other
robust. A function which returns the value of the ductile sensitivity is implemented in Mathematica [?] in Figure 4. In Figure 4, ψ
takes two arguments, s1 and s2, which are two values of a single design parameter, x_1 and x_2. The Mathematica module returns
the partial derivative of γ as shown in the bottom of Figure 4.

In Figure 5, the value of the ductility sensitivity is shown for the system from Figure 3 as a function of the difference between
x_1 and x_2. In Figure 5, x_2 is constant and x_1 varies. As x_1 and x_2 become equal, ψ goes to zero. This is because the performance
curves become the same and the area A - B disappears. Also, when the values of x_1 and x_2 are far apart, the area A - B becomes
large and the rate of change of the area becomes large. Note that because of the implementation of the Mathematica module which
computes γ, the order of the arguments to the Mathematica function in Figure 4 is significant.



$$\gamma(x_1, x_2) = A - B \tag{2}$$

$$\psi(x_1, x_2) = \frac{\partial \gamma(x_1, x_2)}{\partial x_1} \tag{3}$$
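The area difference of Equation 2 and its sensitivity in Equation 3 can also be approximated numerically. The sketch below is in Python rather than Mathematica and assumes a hypothetical exponential performance curve, since the curves of the example system are not given in the text; the names performance, area_under, gamma and psi are illustrative only.

```python
import math


def performance(s, x):
    """Hypothetical performance curve (an assumption, not the paper's system).

    s is the amount by which tolerance is exceeded; a larger design
    parameter x gives a steeper, more brittle decline.
    """
    return math.exp(-x * s)


def area_under(x, s_max=10.0, steps=2000):
    """Trapezoidal area under the performance curve for parameter value x."""
    h = s_max / steps
    total = 0.5 * (performance(0.0, x) + performance(s_max, x))
    for i in range(1, steps):
        total += performance(i * h, x)
    return total * h


def gamma(x1, x2):
    """Equation 2: the area difference A - B between the two curves."""
    return area_under(x1) - area_under(x2)


def psi(x1, x2, dx=1e-4):
    """Equation 3: central-difference estimate of d(gamma)/d(x1)."""
    return (gamma(x1 + dx, x2) - gamma(x1 - dx, x2)) / (2.0 * dx)


if __name__ == "__main__":
    print(gamma(1.0, 2.0), psi(1.0, 2.0))
```

As with the Mathematica module, the order of the arguments matters in this sketch: swapping them changes the sign of gamma.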
Fig. 3

An Example System. 



III. System Energy 

As a digital system approaches the edges of its operating tolerance, the energy required to maintain the performance increases, as
shown in Figure 6. Consider quality of service on a router in a communications network. The energy required to forward a packet
is routinely modeled as directly proportional to the length of the packet. As the load increases, input queues begin to fill to capacity
and packets are dropped because computational energy is not sufficient to keep up with the load. In this case, performance is the
probability of not dropping a packet and energy is the processing power, which is directly proportional to the packet service rate, μ.
In[31]:= ψ[s1_, s2_] := Module[{x, y},
           (D[γ[x, y], x] /. {x -> s1, y -> s2})
         ]

In[35]:= ψ[s2, s1]

Out[35]= [symbolic partial derivative of γ; the expression is not legible in this copy]



Fig. 4 

Ductility Sensitivity Definition. 




In an M/M/1 queue, a direct relation between performance and energy is shown in Equation 4, where n is the expected queue size,
N is the maximum queue capacity and ρ = λ/μ, with λ the packet arrival rate. The result is graphed in Figure 7.

$$P[n < N] = \;\ldots \tag{4}$$
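For reference, the textbook result for an infinite-buffer M/M/1 queue with utilization ρ (arrival rate over service rate) is P[n < N] = 1 - ρ^N. The sketch below uses that standard result, which may differ in form from the paper's Equation 4, to show how the service rate (the energy in this example) must grow as the required performance approaches one. The function names are assumptions for illustration.

```python
def prob_queue_below(capacity, arrival_rate, service_rate):
    """P[n < N] for a stable infinite-buffer M/M/1 queue: 1 - rho**N."""
    rho = arrival_rate / service_rate
    if not 0.0 <= rho < 1.0:
        raise ValueError("queue must be stable (rho < 1)")
    return 1.0 - rho ** capacity


def required_service_rate(target, capacity, arrival_rate):
    """Service rate needed so that P[n < N] equals the target performance.

    Solves 1 - (arrival_rate / mu)**capacity = target for mu.
    """
    if not 0.0 < target < 1.0:
        raise ValueError("target must lie strictly between 0 and 1")
    return arrival_rate / (1.0 - target) ** (1.0 / capacity)


if __name__ == "__main__":
    # The required service rate rises without bound as the target nears one.
    for target in (0.9, 0.99, 0.999, 0.9999):
        print(target, required_service_rate(target, 10, 0.5))
```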





IV. Brittle Sub-Components 

Consider a system whose sub-components exhibit various degrees of ductility as defined above. Just as adding impurities to a
pure metal causes it to become stronger but more brittle, the addition of more efficient but also more sensitive components to a
system increases the system's performance within its operating range, but makes it less ductile. How do the effects of ductility
propagate among the sub-components to influence the ductility of the entire system? Assume the performance response curve is
known for each sub-component and that the output from one component feeds into the input of the next component as shown in
Figure 8. Assume that the sub-component output performance cannot be better than any of its inputs. Then the performance curve
for the output of each sub-component is the minimum of the input sub-component performance curve and the current component
performance curve.
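The minimum rule stated above can be sketched as follows, assuming each performance curve is sampled on a common grid of state values. The function name and the sample curves are hypothetical.

```python
def propagate(curves):
    """Propagate performance through a chain of sub-components.

    curves is a list of performance curves, one per component in chain
    order, each sampled on the same grid. The output curve of each
    component is the pointwise minimum of its input curve and its own curve.
    """
    outputs = []
    current = None
    for curve in curves:
        if current is None:
            current = list(curve)
        else:
            if len(curve) != len(current):
                raise ValueError("all curves must share the same grid")
            current = [min(a, b) for a, b in zip(current, curve)]
        outputs.append(current)
    return outputs


if __name__ == "__main__":
    brittle = [1.0, 0.9, 0.2]   # high performance, steep decline
    ductile = [0.8, 0.8, 0.7]   # lower performance, graceful decline
    print(propagate([brittle, ductile]))
```

In this example the chain inherits the lower in-tolerance performance of the ductile component and the steep out-of-tolerance decline of the brittle one.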

The hardness component of the brittleness enhances the performance when values are within tolerance, and low ductility degrades
the performance when values are out of tolerance. The amount of degradation depends on the amount by which the tolerance was
exceeded. This is illustrated in Figure 9 and is stated in Equation 5, where b is the brittleness, P_in is the input performance, T is
the set of in-tolerance values, x is a state parameter, E[] is the expected value, and P_out is the output performance. The result of
Equation 5 is plotted in Figure 10.
System Performance and Energy.

Fig. 7

Performance and Energy.

Distribution and Performance.



$$P_{out} = (P_{in} + P_{in} b)\,\mathrm{Prob}[x \in T] + (P_{in} - P_{in} b\, E[x - \max[T]])\,\mathrm{Prob}[x \notin T] \tag{5}$$

As b decreases in a non-brittle system, T increases. It is this relationship between b and T which is the principal focus of brittle
systems analysis. Assume the simple case of a normally distributed performance distribution; then Equation 6 shows how Equation
5 can be refined. N(η, σ) is a normal distribution with an average of η and variance of σ, and R.V._{N(η,σ)} is a random variable with
distribution N(η, σ).

$$P_{out} = (P_{in} + P_{in} b)\,\mathrm{Prob}[R.V._{N(\eta,\sigma)} < T] + (P_{in} - b\eta)\,(1.0 - \mathrm{Prob}[R.V._{N(\eta,\sigma)} < T]) \tag{6}$$
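Equation 6 can be evaluated directly using the normal cumulative distribution function. The sketch below follows one reading of the equation, in which the in-tolerance term is P_in + P_in b and the out-of-tolerance term is P_in - bη, and treats σ as a variance as stated in the text. The function names are illustrative assumptions.

```python
import math


def normal_cdf(t, mean, variance):
    """Prob[R.V. < t] for a normal distribution with the given mean and variance."""
    return 0.5 * (1.0 + math.erf((t - mean) / math.sqrt(2.0 * variance)))


def expected_output(p_in, b, tolerance, mean, variance):
    """Equation 6 as read here: expected output performance of one component."""
    p_within = normal_cdf(tolerance, mean, variance)
    return ((p_in + p_in * b) * p_within
            + (p_in - b * mean) * (1.0 - p_within))


if __name__ == "__main__":
    # Sweep brittleness for a tolerance slightly above the mean.
    for b in (0.0, 0.2, 0.4, 0.8):
        print(b, expected_output(1.0, b, 11.0, 10.0, 3.0))
```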

As b decreases we assume that the system is non-brittle so that T increases. Assume that b is linear; then T increases as shown
in Figure 11 and Equation 7.

$$T = P_x + \frac{1}{b} P_y \tag{7}$$

Analytical Brittle Subcomponent Result.

A BONeS model has been developed to examine brittle sub-components, as shown in Figure 12, which models Figure 8. A BONeS
data structure contains the performance or quality of the input to a component. A normal random number generator produces a
value with a specified mean and variance, in this case 10.0 and 3.0 respectively. The difference between the random number and
the upper limit (11.0) is computed. If the normal random number is greater than the upper limit, then the performance value of
the input data structure is reduced by the brittleness multiplied by the amount by which the tolerance was exceeded. If the normal
random number is within tolerance, then the input data structure is increased by an amount proportional to the brittleness.
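The component rule described above can be imitated with a small Monte Carlo sketch. This is not the BONeS model itself; the proportionality constant gain, the sample count, and the seed are assumptions introduced for illustration, while the mean (10.0), variance (3.0) and upper limit (11.0) come from the text.

```python
import random


def simulate_component(p_in, b, mean=10.0, variance=3.0, upper_limit=11.0,
                       gain=1.0, samples=10000, seed=0):
    """Average output performance of one brittle component.

    gain is a hypothetical constant for the in-tolerance increase, which
    the text only describes as proportional to the brittleness b.
    """
    rng = random.Random(seed)
    std_dev = variance ** 0.5
    total = 0.0
    for _ in range(samples):
        x = rng.gauss(mean, std_dev)
        excess = x - upper_limit
        if excess > 0.0:
            # Out of tolerance: reduce by brittleness times the excess.
            total += p_in - b * excess
        else:
            # Within tolerance: increase in proportion to brittleness.
            total += p_in + gain * b
    return total / samples


if __name__ == "__main__":
    for b in (0.0, 0.2, 0.4, 0.8):
        print(b, simulate_component(1.0, b))
```

Setting gain to zero isolates the degradation term, which makes the cost of exceeding the upper limit visible on its own.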



Fig. 12
BONeS Model.



The results are shown in Figure 13 for the normal random number values, the upper limit, and the system performance as a
function of the consecutive order in which each of the values was sampled. The brittleness is varied from zero to 0.8 and the
results are averaged. Clearly the performance degrades when the normal values exceed the upper limit. Figure 14 shows the
performance results for the sub-components and the entire system from Figure 12. Components 1 and 3 generate data structures
with a performance value of one. An intermediate component, Component 2, has a brittleness which varies from zero to 0.8. The
final output component, Component 4, has a brittleness of 0.3. The analytical results from Figure 10 and the simulated system
performance curve from Figure 14 are in close agreement. Although Component 2 performance improves when the brittleness
is between 0.2 and 0.5, Component 4, which is the system performance, declines. This is because Component 4 performance
depends on the minimum performance input which comes from Component 3, an initial input component that always generates a
performance of one.

If the ductility of sub-components can be controlled, how should the brittleness be adjusted among the sub-components? One
line of reasoning yields the result that in systems run near the maximum operating tolerance, better performance will be achieved
with highly brittle components placed near the outputs of the system. This is because there is then less chance for the highly brittle
components to affect the other sub-components. The next simulation, shown in Figure 15, examines this question. The brittleness
of the first component is varied from zero to one and the second component remains at a brittleness of 0.5. The results are shown
in Figure 16. The results are also shown in the same figure for the first component brittleness of 0.5 and the second component
brittleness varying from zero to one. Figure 16 indicates that the best performance curve results when the more highly brittle
component is the last component in the chain.



V. An Example of Ductility in a Communications Network 

The following applications which exhibit brittle behavior have been chosen as simple examples so that the ideas presented in 
this work, rather than the details of the applications, can be investigated. These examples will be examined in more detail as this 
work progresses. 



A. Adaptive Multimedia 

Current network applications, especially multimedia applications, have performance which degrades rapidly after bandwidth is 
reduced beyond a certain point. In [?] it is suggested that if applications can be developed which degrade gracefully with respect to 
loss in bandwidth, as shown in Figure 17, then the network can be designed to maintain bandwidth within the required bounds on a
best effort basis. A solution recommended in [?] is for the network to keep a certain amount of bandwidth in reserve. However, the 
more bandwidth kept in reserve, the less that remains to support the network as a whole. Thus the amount of reserve bandwidth is 
the greatest factor affecting ductility in this example. As the value of reserve bandwidth increases, the number of users which can 
be supported is reduced, but fewer calls in progress are disconnected. 



B. Packet Recovery: Stop And Wait System 

The second example is recovery from packet loss in an Automatic Repeat Request (ARQ) link shown in Figure 18. We consider
two types of packets: packets with a large delay and packets which are lost. Setting a high time-out value results in better 
performance for packets which have a high ratio of delay to loss, but degrades rapidly as the ratio approaches zero. 
Brittleness Location Experiment.
Fig. 16

Brittleness Location Results. 



C. TDMA Reservation System 

Figure 19 shows transmission rate versus probability of transmission for two values of retransmission. The lower valued setting
for retransmission has higher performance; however, the higher valued retransmission setting is slightly more robust around a
probability of 0.013.

D. Mobile Cellular Telephone System 

Figure 20 shows grade of service versus channels per base station.
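The text does not state how grade of service is computed. A common choice in cellular dimensioning is the Erlang B blocking probability, and the sketch below assumes that model purely for illustration; the offered load and the channel counts are hypothetical.

```python
def erlang_b(offered_load, channels):
    """Erlang B blocking probability via the standard stable recursion.

    B(0) = 1 and B(k) = A * B(k-1) / (k + A * B(k-1)), where A is the
    offered load in Erlangs.
    """
    if offered_load < 0.0 or channels < 0:
        raise ValueError("offered_load and channels must be non-negative")
    blocking = 1.0
    for k in range(1, channels + 1):
        blocking = offered_load * blocking / (k + offered_load * blocking)
    return blocking


if __name__ == "__main__":
    # Blocking falls quickly once the channel count passes the offered load.
    for channels in range(1, 13):
        print(channels, erlang_b(5.0, channels))
```

Under this assumed model, the region where blocking rises sharply as channels are removed is where the system behaves in a brittle manner.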

E. Buffer Capacity 

Another example of a brittle system involves choosing buffer capacity in a data communications system. 

F. Backlogged Packets in Slotted ALOHA 

In Slotted ALOHA, data packet transmission occurs using equal sized packets within equally divided time slots. If two or
more users transmit within a given time slot, a collision occurs; the packet will be retransmitted in a following time slot with a
given probability. This example of a brittle system exhibits catastrophic behavior [?]. Let p_0 be the probability that a packet to
be transmitted finds an empty cell, and p_1 be the probability that after a collision, the cell attempts retransmission. The design
parameters are p_0 and p_1, and the number of packets waiting for retransmission is the state. A graph of p_0 and p_1 forms a cusp and
all the classic symptoms of catastrophe are present, namely, bifurcation, sudden jumps, hysteresis, inaccessibility, and divergence.

Ductile Network Applications (Top Graph).
Delay, Probability of Error, and Timeout in a Stop And Wait System.
TDMA Probability of Transmission and Transmission Rate.
Grade of Service versus number of Channels per Base Station.
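The backlog dynamics of Slotted ALOHA can be explored with the standard textbook finite-population model, in which each unbacklogged user transmits a new packet with one probability and each backlogged user retransmits with another. The sketch below assumes that model and maps its two probabilities onto p_0 and p_1; this mapping, the function names, and the sample values are assumptions, and the model in the cited reference may differ.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability that exactly k of n independent users transmit."""
    if k < 0 or k > n:
        return 0.0
    return comb(n, k) * (p ** k) * ((1.0 - p) ** (n - k))


def success_probability(backlog, users, p0, p1):
    """Probability that exactly one packet is sent in a slot.

    Either one new packet and no retransmission, or no new packet and
    exactly one retransmission.
    """
    fresh = users - backlog
    return (binomial_pmf(1, fresh, p0) * binomial_pmf(0, backlog, p1)
            + binomial_pmf(0, fresh, p0) * binomial_pmf(1, backlog, p1))


def backlog_drift(backlog, users, p0, p1):
    """Expected change in backlog over one slot: new arrivals minus successes."""
    return (users - backlog) * p0 - success_probability(backlog, users, p0, p1)


if __name__ == "__main__":
    # Sign changes of the drift mark the equilibrium backlog levels.
    for backlog in range(0, 51, 5):
        print(backlog, backlog_drift(backlog, 50, 0.01, 0.1))
```

Sweeping p0 and p1 in this sketch shows how the number of drift sign changes, and therefore the number of equilibria, depends on the two design parameters.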

G. Variable Window Flow Control 

H. Flow Control 

Also in [?], it is suggested that flow control, shown in Figure [?], in a communications network exhibits not only brittle behavior,
but catastrophic behavior. The specific model of flow control considered in [?] is to divide available buffer space into classes and
allow packets which have passed through i hops to occupy buffers assigned to class i.



VI. Techniques for Handling Brittle Systems 

There are a variety of techniques for controlling and enhancing the ductility of a system. The first is to assign values to design
parameters which influence ductility in a static manner, that is, before the system becomes operational. The next involves dynamically
changing the ductility as the system operates. This would be analogous to a material which could automatically trade off
hardness for ductility whenever necessary in order to maximize its performance. The remaining techniques involve methods of
attempting to avoid brittle fracture, by design or by rolling back from a fracture.

Variable Window Throughput Comparison.

A. Ductility Setting of System Sub-Components 

Now that ductility has been defined and the design parameters controlling ductility identified, a natural question to ask is how
the sub-component parameters should be set. Within normal operation, the performance requirements must be met, and in addition
we would like the system to be tough (robust) outside the normal operating range as well. Is there a benefit to how ductility
is distributed among subsystem components? As an example, in network and transport level data communications systems, if
the system is going to fail, it is beneficial for low level system components to fail early in the transmission process rather than
transporting a packet close to its destination and finding that the entire packet/frame has to be retransmitted later. Thus, it would be
better to set x_1, in Figure [?], so that sub-component S1, which performs its processing early, has a lower ductility than components
later in the process.

A highly brittle component, as illustrated in Figure [?], would appear to have the characteristics of an on-off constant bit rate
(on-off CBR) source. These types of sources have been used to model ATM [?] traffic sources. Queue fill distribution has been
analyzed in [?] for on-off CBR models. These results could be used in a buffer solution for such highly brittle components.
B. Adaptation

As mentioned previously, there are two forms of strain, reversible and permanent. Reversible strain is called elastic strain
and is characterized by Young's modulus: the ratio of the stress over the strain. Permanent strain leaves the shape of a material
permanently changed and is known as plastic strain. In brittle systems, an analog to plastic strain is adaptation. Once we know the
parameters which affect ductility, that is, having determined ψ, the values of the parameters can be changed dynamically as stress
causes the system to approach a brittle fracture.

C. Rollback 

Another possibility is that once the system approaches a brittle fracture, the system has the capability to roll back to a safe state
and choose another gradient which attempts to remain in a safe state of operation. Rollback techniques within a communications
network environment have been described in [?].